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Abstract. For high dimensional problems, such as approximation and inte- 
gration, one cannot afford to sample on a grid because of the curse of di- 
mensionality. An attractive alternative is to sample on a low discrepancy set, 
such as an integration lattice or a digital net. This article introduces a mul- 
tivariate fast discrete Walsh transform for data sampled on a digital net that 
requires only 0{N log N) operations, where A'^ is the number of data points. 
This algorithm and its inverse are digital analogs of multivariate fast Fourier 
transforms. 

This fast discrete Walsh transform and its inverse may be used to ap- 
proximate the Walsh coefficients of a function and then construct a spline 
interpolant of the function. This interpolant may then be used to estimate 
the function's effective dimension, an important concept in the theory of nu- 
merical multivariate integration. Numerical results for various functions are 
presented. 



1. Introduction 

The idea of the fast Fourier transforms goes back to Gauss and has been popular 
ever since the seminal work of Cooley and Tukey [5] . Let / be a function from 
[0, 1] to the complex numbers. The task is to compute 

J{k) = f{n/N)c^'''''"/^ for A: = 0, . . . , TV - 1. 

n=0 

These are TV sums, each consisting of iV summands. Hence a straight forward cal- 
culation would have complexity of 0{N'^) operations. But the sums have a certain 
structure which can be exploited. Indeed Cooley and Tukey showed that those 
sums can be computed with 0{N log N) operations. (There is some dependence of 
the implied constant on the number N; the algorithm works best if is a prime 
power, see |5j). 

In higher dimensions an effect commonly referred to as the curse of dimension- 
ality occurs. Let / : [0, 1]'' — > C and consider the discrete Fourier transform 

p-i 

/(fc)= V /(ni/p,...,n,/p)e2-("^'=^+-+"='==)/j' 
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for all k — {ki, . . . , ks) G {0, . . . ,p ~ 1}'^. Here the number of points sampled is 
N = p^. Hence if s is large, say 100 for example, then choosing even p = 2 yields 
j\f _ 2100 ^ ^q30 points, making such a computation infeasible for contemporary 
computers. 

In the example above the design or set of sample points is a grid aligned with 
the coordinate axes, {(ni/p, . . . , Us/p) : < Uj < p}. To avoid the curse of dimen- 
sionality one needs a much smaller point set that is constructed differently. Such 
point sets have previously been considered in the context of numerical integration, 
see [Tallin]- Two popular construction methods are integration lattices (see [T5] ) 
and digital nets and sequences (see This article focuses on the latter fam- 

ily of points. (Numerical approximation using lattice rule designs and an FFT 
has been treated in [121 123 •) The first examples of digital sequences were given 
by Sobol [20] and Faure [9] before Niederreiter introduced the general concept of 
(i, m, s)-nets and {t, s)-sequences. See |15| for a recent survey. These construc- 
tions yield extremely well distributed point sets if the quahty parameter t is small. 
Digital {t, TO, s)-nets are a special construction of {t, to, s)-nets, and in the same 
way, digital (t, s)-sequences are a special construction of (t, s)-sequences. Digital 
constructions are introduced below. 

Definition 1.1. Let Zp be a finite field of prime order p, let Ci,...,Cs be s 
TO X m-matrices over Zp — {0, 1}. The digital {t,m,s)-net P{C) = 

{ajQ, • . . , aJp^-i}, based on C = (Ci,...,Cs), is then defined as follows: let < 
n < and n = tiq + nip + • • • -I- n„j_ip™~^ be the base p representation of n. 
Define n = (no, . . . , rim-i)^ G Z™ and let 

(1.1) y,,„-QneZ™. 
Express y^vn as {yj ^n,i, • . • , yj,n,m)^ G ^p": ^ud then define 

*^j,n — yj,n,lP ^" ■ ■ ■ ^ yj.n,mP 

The n-th point a;„ of the digital net P{C) over the finite field Zp is given by cc„ = 

(•^l,n ; ■ ■ ■ ; -^s^n ) ■ 

The t value is a non-negative integer such that for all < di, . . . ,ds < m — t 

with di + • ■ • + ds = m — t the system of vectors c"i^ , • . • , c"J^ , ■ • ■ , cf^ , ■ • ■ , c"d^ 

(i) 

hnearly independent over Zp. Here Cj, refers to the fc-th row of the matrix Cj. For 
a geometrical interpretation of the t value see, for example, p3] . Smaller values of 
t characterize more uniformly distributed nets. 

Digital nets are often used in conjunction with certain wavelets, namely Haar 
functions, first used by Sobol [21], and Walsh functions, first used by Larcher [10] 
and Larcher and Traunfellner [llj. 

Next, Walsh functions, which are piecewise constant, in base p are defined. For 
more information on Walsh functions see for example [U [24] (or in the context 
of numerical integration see [B]). Throughout this article let No denote the set of 
non- negative integers and let N denote the set of positive integers. 

Definition 1.2. Let p > 2 be an integer. For a non-negative integer wavenumber 
k with base p representation 

k = ko + kip H h ka-ip"^^, 
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with ki e Zp, the Walsh function pwaU : [0, 1) — > C is defined by 

pwalfc(x) :=^-ifco+-+-„fc_,^ 

where ujp — e^'^'/P, for x G [0, 1) with base p representation x — xi/p + X'z/p^ + • • • 
(unique in the sense that infinitely many of the Xi must be different from p~l). If 
it is clear which base p is chosen we simply write walk ■ 

Definition 1.3. For dimension s > 2, xi, . . . ,Xs E [0, 1) and /ci, . . . , fc^ G No define 
pwaUi,...,fc, : [Oi 1)^ — > C by 

S 

pwaUi^...,fc^(a;i, . . . pwalfe^.(xj). 

For wavenumber vectors k — {ki, . . . ,ks) G Ng and x — (xi, . . . , Xs) £ [0, 1)'' we 
write 

pwalfc(a;) := pwalfej,...,fc^ (xi, . . . , a;,,). 
Again, if it is clear which base is meant we simply write walfc(a;). 

Let a Walsh series / G C2{[0, 1]*) be defined by 

fix) = /(fc)walfc(a;), 

where the Walsh coefficients are given by 

(1.2) /(fe) = / /(a;)walfc(a;)da;, 

-'[0,1]= 

since the Walsh functions are mutually orthonormal. Here a denotes the complex 
conjugate of a complex number a. 

The aim now is to approximate Walsh coefficients of a function / with wavenum- 
bers lying in a certain set of wavenumbers, /C(C), depending on the digital net P{C) 
defined by the generating matrices C = (Ci, . . . , Cs). The details of how /C(C) is 
chosen is explained in the next section. A digital net with points can be used 
to estimate |/C(C)| = Walsh coefficients, where | • | denotes the cardinality of a 
set. For k G /C(C) we approximate /(fc) by the finite sum 

(1-3) m = ^ E /(^)^^M^- 

xGP(C) 

We call f{k) the discrete Walsh coefficients because (|1.3p is just a discrete version 
of (|1.2p . Those discrete Walsh coefficients provide us with valuable information 
about the function at hand. 

A naive calculation of the p™ discrete Walsh coefficients f{k) with k G IC{C) 
would require ©(p^™) operations, but using the fast discrete Walsh transform algo- 
rithm described in the following section, we can reduce it to 0(mp™' logp) opera- 
tions. In Section [3] the discrete Walsh coefficients are used to interpolate functions 
based on observations on the digital net design. The inverse discrete Walsh trans- 
form then provides an interpolatory approximation of the original function. The 
ANOVA decomposition of this interpolation provides information about the effec- 
tive dimension of the function as explained in Section U) In the last section the fast 
discrete Walsh transform is used to approximate Walsh coefficients and effective 
dimensions of some explicit test functions. 
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2. Multivariate fast discrete Walsh transform over digital nets 

In this section we introduce an FFT-like algorithm for a multivariate fast discrete 
Walsh transform over digital nets. An essential role is played by the dual net of a 
digital net. 

2.1. The Dual Net. Let C = (Ci, . . . ,Cs) be the vector of matrices generating a 
digital {t, TO, s)-net P{C) over a finite field Zp, where p is prime. For any wavenum- 
ber vector fe = (fci, . . . , fcs) g Ng, we define 

Ck^ C^k^""^ + ■■■ + cjki^'> e z^, 

where for kj = kj_o + kj,ip + ■ ■ ■ we define — (fcj,o, ■ • ■ , fcj,m-i)"^ G as the 
TO-element truncation of kj, and all operations are carried out in the finite field Zp. 
The dual net I?(C) of the digital net P{C) is the set of all wavenumbers which make 
this dot product zero: 

V{C) ^{keWo-.C -k^O}. 

The dual net appears in the worst-case error for multivariate integration in certain 
Walsh spaces, see [B]. Therein the worst-case error is just the sum of a certain 
function over all elements in the dual net except 0. 
The dual net satisfies 

(2.1) — > walfc(a;) = < „ ^, . 

^ ' rjni ^ ^ 1 otherwise, 

which was shown, for example, in fB^. Indeed, this property could also be used to 
define the dual net. 

For k € No with k ^ ko + kip + ■■ ■ let j/(0) = and for A: > let iy{k) = 
1 + max{? : ki ^ 0}. For k = (fci, . . . , fc^) g define vik) = J2i=i ^{^i) (see 
[131I16]). The function v is a. norm on the elements in the wavenumber space. It 
depends on the most significant bit of the coordinates and can be related to the 
i- value of a digital net, see [l6| . 

For non-negative integers fc,/ G No with k = kQ + kip+- ■ ■ and I — Iq + Iip+- • • a 
digit- wise addition and subtraction in base p can be defined by = ao + aip+ • • • 
where ai = ki + li (mod p) and k Q I = bo + bip + ■ ■ ■ where hi = ki — li (mod p). 
For non-negative integer vectors the digit-wise addition and subtraction are defined 
component-wise. 

Using this digit-wise addition and subtraction we obtain a group structure on 
Nq of which the dual net 2?(C) forms a subgroup. It can be checked that the cosets 
of the subgroup V{C) are given by 

V{h) ^{k^Wo-.C-k^h} 

for h g Z™ and hence there are cosets. Note that I?(0) — 'D{C), i.e., the coset 
containing is the dual net. The set of wavenumbers whose Walsh coefficients are 
to be approximated, /C(C), is then obtained by choosing exactly one representative 
in each coset. For each h g Z™ identify k g I?(/i) such that v{k) < v{l) for all 

I g Vih). This k is the representative of V^h) chosen to be in K.{C). In the case of 
more than one k from the same coset satisfying this condition, one may choose, for 
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example, the k that is the smallest in lexicographic order. That is, 

/C(C) = |fe e No : fe G V{h) for some h e Z^", and for any I e V{h) 

we have J'(fe) < v{l) and if v{k) ~ v{l) for some I G 'D{h), then 

ki — li, . . . , fcj-i = Ij-ij kj < Ij for some j — I, . . . , s.} 

This definition implies that the zero vector is automatically in /C(C), and that Nq 
is the direct sum of IC{C) and I?(C). 

2.2. Multivariate fast discrete Walsh transform over digital nets. Let a 

Walsh series f £ £2([0, 1]'*) be given by 

fix) = X! /(^)walfc(a;). 

For all k €E JC{C) one may approximate /(fe) by the discrete Walsh transform 
(DWT), f{k), which is defined as 

•^e^) ^ ^ fix)'^^^kix) = ^ f{l)-^ walj(a;)walfc(a;). 

Note that wali(a;)walfc(a;) — waljefc(a;) (see [6]), and hence the rightmost sum in 
the equation above is one if I Q k E 'D{C) and zero otherwise. Thus it follows that 

(2.2) J{k)^f{k)+ J2 

iev{c)\{o} 

Hence, the terms f{k © I) are completely aliased with each other for all I E T^{C). 
We have chosen k E /C(C) such that k is closest to 0. Hence if higher frequency 
contributions are sufficiently small, that is, f{k) decays sufficiently fast the further 
k is away from with respect to the norm v, then f{k) w f{k). 

A straightforward calculation of the discrete Walsh coefhcients would require 
0{p^™') operations, as we have sums to compute (one sum for each k E IC{C)) 
and each sum requires 0{p"^) operations. But as shown below, certain parts in the 
summation above can be reused and thereby reducing the number of operations. 

Let Xn be the n-th point of the digital net P{C) and let k be the unique element 
of V{h) n /C(C). Then we have 

walfc(a;„) = uj^''\ 

because 

s oo 

yj.n,ikj,i = {kfCi H h kJCs)n = h'^n. 

Hence 

^ n=0 

where h = C ■ k. The above sum may be written as 

n^-i— ni— no— 
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Now, computing first the innermost sum for each /iq = 0, . . . ,p — 1, then the second 
innermost sum for each hi = 0, . . . ,p — 1 and so on yields an algorithm which needs 
only 0{mp"^^^) operations. The details are given as follows. 

Algorithm 1 (Fast Discrete Walsh transform (FWT)). For no, . . . ,nm-i € Zp 
we define G(°)(no, . . . , = /(a;„Q+...+„,^_jp™-i ). Then for r = 1,2, 

compute for all rir, . . . , rim-i € and all h^, . . . , /i^-i G Zp the sums 

(2.3) G'-''\ho, . . .,hr-i,nr, . . . ,nm-i) 



, m 



p-i 

E - 

nr-l=0 



p "-I'^-iG^'- 1) (ho, . . . , hr-2, rir-i, n^, . . . , rim-i)- 



For k e /C(C) with C • A; = /i let 

pm 

Note that in each step one needs C'(p™+-'^) operations, and as there are m steps, 
one needs 0(771^™+^) operations altogether. Note also that the number of terms in 
the summation in the right side of (|2.3p is p, which is a prime number. The index 
rir-i = 1, . . . ,p — 1 (and hr-i) forms a group under the multiplication modulo p. 
Thus, following the ideas of Rader's algorithm J^, we can rewrite the indices as 
rir-i — 5" mod p and ft-r-i ~ 9^ mod p, where 17 is a primitive root of this group, 
and a — 0, ... ,p — 2, P = 0, ... ,p — 2. By applying Rader's algorithm, we can 
further reduce the total number of steps to 0{mp"^ logp) [18]. 

3. Function Interpolation 

In this section we consider multivariate spline interpolation over digital nets 
using the discrete Walsh coefficients described in the previous section. Multivariate 
spline interpolation over lattice rules was considered in [291 . See also |23j for more 
information on properties of splines. 

3.1. Reproducing kernel Walsh space. Before we introduce the interpolation 
algorithm we introduce reproducing kernel Hilbert spaces based on Walsh functions, 
see [6]. In the following we define the weighted Hilbert space Hk based on Walsh 
functions. 

Consider the set of functions 

{n'-l 
/ : fix) - "*^(^' • ^ ^0' ^ {^^ito' C [0, 1)^ 
1=0 

defined in terms of a symmetric, positive definite kernel function K : [0, 1)^* C. 
The kernel allows us to define an inner product on Ti.Q_K as 

i=Q j=0 

for any two functions / = ^"J,q^ ctiK{-, x[) and g = Y^^lo^ PiK{-, y[). The linear 
space "Ho^K may then be completed to obtain a Hilbert space, Tix, for which is 
the reproducing kernel, see [T], 

Ki;y)enK, fiy)^{K{;y)J) Vy e [0, l)^ V/ e Hk- 
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The kernel functions considered here are based on Walsh functions and may be 
written as 

K{x, y) = K{x e y, 0) = ^ k{k)wa\k{x Q y), 

where positive definiteness is ensured by requiring K{k) to be real and non-negative 
for all k G Nq. The finiteness of this kernel is ensured by requiring that K be 
summable. The inner product for the Hilbert space defined by this kernel may be 
written as an £2 inner product in the spectral domain: 



{f,9)nK = 



f{k)g{k) _ / / 



K{k) \\/k \fK , 



where / and g are the Walsh coefficients of / and g. The accompanying norm is 
11/11 = {fif)nl- 

3.2. Interpolation of functions in the Walsh space. We now interpolate 
functions in Tipc using a linear combination of the reproducing kernel function 
where the second variable is fixed, K{-,Xn), n — 0, — 1, and where the 

Xn are points taken from a digital net, V{C). Given the values of a function 
f{xn),n = 0, . . . — 1, one can approximately recover it by a spline, defined 
as 

Sf{x)^ ^ CnK{x,Xn), 

71=0 

where the c„ are the coefficients to be found by interpolation: f{x„) = Sf{x„) for 
n — 0, . . . ,p"^ — 1. This translates into solving the linear system 



f{xn) = ^ CyK{xn, Xy) for n = 0, . 



for the coefficients cq, . . . , Cp™_i given the f{xn) and K{xn,Xy). In the following 
paragraphs we show that the c„ can be computed in 0{mp"^ logp) operations. See 
[50] for an analogue for lattice rules in the context of Fredholm integral equations 
of the second kind. 

First observe that for any k e /C(C) the DWT of the function data is 



_ 1 1 

= ~ X! /(a;„)walfe(a;„) ^ — ^ 5/(a;„)walfe(a;„) 

^ n=0 ^ n=0 

^ ~ 53 -^('^ 5Z Crwalfc(a;r) ^ walj(a;„)walfc(a;„) 



p" 

^ JeN= r=0 n=Q 



(3.1) =p'" J2 K{k®d)Z{k)^p"'K{k)Z{k), 

dev(c) 

where c{k) = p^"^ J2n=o^ ^"-^^^kixn) is the DWT of the coefficients, {cn)n=o^ ' 
and K{k) ^ J2iev{C) -^(^ ® is the DWT of the kernel data, (i^(a;„, 0)fnZo^ ■ 
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This implies that the DWT of the coefficients is essentiahy the quotient of the 
DWTs of the function data and the kernel data: 

(3.2) ?(fe)= -^^^^ 



p"^K{k) 

Using the FWT algorithm we can compute all {,f{k))keic(c) ^^nd (K{k))k^ic(c), and 
hence {c{k))keK:(C) in 0{mp"^ logp) operations. To compute co, . . . , Cpm-i requires 
the inversion of the DWT, which is introduced in the next section. 

3.3. Fast inverse discrete Walsh transform. The following lemma gives the 
key to the inverse discrete Walsh transform over a digital net. 

Lemma 3.1. Let P{C) = {xq, ■ ■ ■ ,Xpm_i} be a digital net with points. Let 
Co, . . . ,Cpm_i he arbitrary complex numbers and let {c{k))keK:{C)> denote the DWT 
of {cn)n=o^ ■ Then for n = 0, . . . ,p"^ — 1 the coefficients are 

Cn= ^ c(fe)walfc(a;„). 
keic{C) 

Proof. The sum on the right of the above equation is 

c(A;)walfc(a;„) = ^ Cy— ^ walfc(a;„)walfc(x„) 
keic{C) v=o P kefcic) 



"^^4^ walfc(a;„ e a;„). 



v=0 ^ keK{C) 

The definition of the Walsh function and the net imply that walfe(a;„ Q Xy) = 
walfe(x„0v) = uj^^^^'^, where k e T>(h) fl /C(C). Thus we have 

fee/c(C) h^z'^ 

and the last sum is 1 if n = i.e. n = v, and otherwise. Hence the result 
follows. □ 

The above lemma describes the inversion of the discrete Walsh transform which 
we might call the inverse discrete Walsh transform. As for the discrete Walsh 
transform, there is also a fast inversion of the discrete Walsh transform which we 

describe in the following. 

Algorithm 2 (Fast Inverse Discrete Walsh Transform (FIWT)). For {Hq,. . . , hm-i G 
Zp we define D^°\ho, hm-i) = c(fe) where fe e V{h) n /C(C). Then for r = 
1, 2, . . . , m compute for all hr, . ■ . , hm-i € Zp and all no, • • • , nr-i G Zp the sums 

D^''\no, . . .,nr-l,hr, . . . ,/lm-l) 

p-1 

= ^ a;^'-i"'-i£)('-^)(no,...,nr-2,/ir-i,/ir,-- 

hr— 1=0 

Then for n = 0, . . . — 1 with n = no -\ h nm-ip"^~^ let 

c„ = £)(™)(no,...,n„_i). 
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Hence co, . . . , Cpm-i can also be computed from (c(fc))fceK;(c) in 0{mp™'^^) op- 
erations. As with the FWT algorithm, this number of operations can be further 
reduced to O{mp"^\ogp) if we apply Rader's algorithm. Thus the spline interpo- 
lation at a point, Sf{x), can be computed in 0{mp"^ logp) operations, and each 
additional point can be computed in 0{p™) operations. 

3.4. Best possible interpolation. Splines as defined above provide optimal in- 
terpolation of a function in at least two senses. The spline approximation is the 
smallest function in that interpolates the data: 

S f — min lloll-w • 

Moreover, the spline algorithm is the best linear algorithm, i.e., algorithm of the 
form Af = J2"=o f{xi)wi{x) for any choice of the Wi{x): 

Sf{x) = urgrnin sup \f{x) - Af{x)\ . 

ll/ll„(K-)<i 

The proofs of these assertions are contained in 8, Chapter 18] and elsewhere. 

4. ANOVA Decomposition of the Interpolation 

Based on the spline interpolation of a function, one can also approximate its 
analysis of variance (ANOVA) effects, i.e., the pieces of the function depending on 
a subset of the s variables. This provides a way to estimate the truncation and 
superposition dimensions of functions via their spline interpolations. 

Let 1 : s denote the set of coordinate indices, {1, . . . , s}, for short. Let u denote 
a subset of 1 : s and let u denote 1 : s\u. For any x £ [0, 1]'*, let a;„ = {xj)j^u 
denote the vector of coordinates indexed by u. The ANOVA decomposition [71 [22] 
of a function / : [0, 1]** — > R, is denoted 



(4.1a) f{x) = ^ fu{x.^ 



jCl:, 



where the ANOVA effect, /„, is defined recursively by taking the integral over [0, 1]" 
and then subtracting the lower order effects: 

(4.1b) /0 = / f{x)dx, fuixu) ^ / f{x)dxu-^ fvixy), 

"'[0,l]'- "'[04]" vcu 

We emphasize that C on the right side of this last equation denotes the proper 
subset. Also, [0, 1]" denotes the Cartesian product of |u| copies of [0, 1], where |m| 
is the cardinality of u. 

The ANOVA effects, /„, of the function / e Hk he in subspaces, TCk^ for kernels 
constructed appropriately. The kernels Ku are products of a univariate kernel, K'. 

The kernel K' for univariate functions is defined as 

oc 

/C'(a;, J/) = i^'(.T e y, 0) = ^ F'(fc)walfc(x e y), 

fc=i 

where the Walsh coefficients of the kernel must be non-negative. One reasonable 
choice is 



aa 
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for k — ko + kip + • • • + fcap" G N with ka ^ 0, and where a > 1 is a parameter that 
measures the digital smoothness of the kerneL These Walsh coefficients have been 
normalized so that 



K'{x, x) = K'{0, 0) = ^ K'{k) = 1. 



fe=i 



Because K' does not include the constant function, i.e., K'{0) = implicitly, it 
follows that K'{x,y) dy = 0. A computable short form of K' can be obtained 
[6j, namely for x — xip^^ + X2P^^ + • ■ ■ and y — xip^^ + X2P^'^ + ■ • • + + 
+ Vi+iP^'^^ H with yi ^ x, we have 

K'{x, y) ^ K'{x e y, 0) = 1 - 



p-1 

The kernel for functions of s variables is a product involving K' , namely, 
K{x, y) = K{x e y, 0) = ^ k{k)vrA\k{x Q y) 

s s 

= [][!+ ljK'{x,,y,)] = + e y„ 0)] 

j=i j=i 

= ^ luKu{Xu,yu) = X! luKu{XuQyu.^). 
nCl:s uCl:s 

where the tuning parameter 7„ has a product form, 

and the Walsh coefficients of these multivariate kernels have expressions in terms 
of the Walsh coefficients of K': 

s 

K(k)^\{[5k^,G + ^,K'(k,)]^ J2 7«^(fe«)4,.o, 

j—l uCl:s 

Ku{xu,yJ = Ku{xuQyu,0) = ^ Ku{ku)walk^{x e y) 

^l[K'{x,,y,)^l[K'{x,Qy„0), 

jeu jeu 

K^{ku)^Y[K'{k,), K^ = l. 
jeu 

The spline approximation via this kernel may be expressed as the sum of its 
ANOVA effects: 

—1 — 1 

n— n— uC.l:s 

where these ANOVA effects are written in terms of the kernels Ku'- 

p"^ — 1 p''^ — 1 

(4.2) {Sr)u{xu)=lu J2 ^"^"( ) = 7n X! c„/i:„(x„ ex„,„,o). 

n=0 n=0 
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These {Sf)u may be verified as the ANOVA effects of Sf defined in (|4.ip by noting 
that K'{x, y)dy — for all x. 

The special form of the reproducing kernel defined here facilitates the calculation 
of the variance of {Sf)u, denoted a^{{Sf)u)- Noting that for |m| > 0, (S'/)„ has 
zero mean, it follows that 



dxu 



[0,1]" 

p-^-l 



~ f 



n,^— 

p^-i 



iiY. ^"^"H r^'( 

n,v— j^u ^ 



Substituting the Walsh expansions of the univariate kernels in the above integral 
and noting that the Walsh functions are orthogonal yields an expression for the 
integral in terms of a related univariate kernel: 



K {^Xj^Xji j^K {xj^x^^j^dxj — R (xt^^j , x^^j ) , 



where 



R'{x, y) = R'{x e 0) = ^ E'(A:)walfe(x y), ^(fc) = [F'(fc) 



fe=0 



^(0) = and R'{k) = 



p ~ p 



„-2aa 



for fc = fco + fciP + • • • + fcaP" G N with ka ^ 0. Moreover, as is the case for K' ^ 
there is a computationally simple form for R! as well. For x = x\p^^ + X2P~^ + • • • 



and y = xip ^ + X2P + • • • + x 
we have 

R'{x,y)^R'{xQy,Q) 



with j/j 7^ Xi 



l-p 



i{l~2a)P 



,2a 



p-l 



(p-l)(p2"-p) 

The product of the univariate kernels R' is then used to define the kernels 
Ruixu^Vu) ^ Ru{xuQyu^Q) ^W_R'{xj,yj) ^ ^ i?u (fc«)walfc„ (a;„ e y„), 



Ru{ku) = \{R'{kj 



jeu 



This definition allows the variance of (S'/)„ to be written in terms of the kernel R^ 



p"-i 



n,v—0 

The sum above may be evaluated naively using 0{p^'^) operations. However, 
using the FWT allows evaluation with only C(m^^"^ log^?) operations. First, the 
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spline coefficients, c„ and Ci, are written in terms of their inverse discrete Walsh 
transform coefficients via Lemma 13.11 

((£'/)„) = 7u X! X! c(fc)walfc(cc„) ^ c{l)wa\i{xy)Ru{xn,u,Xy.u) 
n,v=Ok<£K:{C) leicic) 



= T« X! c(fc)c(0 ^ i?«(a;n,ti ea;„^„,0)walfc(a;„)walj(a;i,). 

k,lGK:(C) n,v=0 

The double sum of the kernel Ru may be further simplified by applying certain 
elementary properties of Walsh functions for any k,l £ JC{C): 

Ru{Xn,u Q Xy^u,0)^a\k{Xn)wa\l{Xy) 

n,v—0 

= ^ Ru{xn,uQXy^u,0)^a\k{xnQXy^u)^aheii^v) 

n,v—0 

xeP(C) n,t)=0 

where the DWT of Ru is 

= ^ ^ i?n(a;„,0)walfc(a;). 

^ xeP(c) 

Substituting the formula for the double sum of the kernel R^ in terms of its DWT 
yields 

(4.3) ^ \Z{k)fRu{k). 

keic{c) 

Since the DWT of the spline coefficients and Ru may each be calculated in 0{mp"^ logp) 
operations by the FWT algorithm, it follows that each a'^{{Sf)u) may be calculated 
in 0{mp™\ogp) operations. 

Calculating the variances of all the S fu may be too burdensome, since there are 
2^* ANOVA effects. However, sums of the a^{{Sf)u) are useful for determining the 
effective dimension of / which is an important factor in the performance of 
quasi-Monte Carlo methods. 

The truncation variance of order d, denoted a1^^{f]d) is defined as sum of the 
variances of all ANOVA effects involving the first d or fewer variables: 

«Cl:d 

The superposition variance of order d, denoted a1^^{f\d) is defined as sum of the 
variances of all ANOVA effects involving d or fewer variables: 

^sup(/;^^)= E ^'(/-)- 

0<\u\<d 
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It follows from these two definitions that the truncation variance is no more than 
the superposition variance, and for all d between and s, 

= 4e(/;0) = ^.'up(/;0) < 'jLif-.d) < <j'Lp{f;d} 

<4,(/;s) = a2,p(/;s) = a2(/). 

The truncation dimension, dtrc, and the superposition dimension, dgup are defined 
as the smallest dimensions for which the truncation and superposition variances, 
respectively, are 99% of the total variance of the function, i.e., 

4.,(/;dfc - 1) < 0.99a2(/) < crlMldtrc), 
alM;dsup - 1) < 0.99a2(/) < a2„p(/;4up). 

The truncation and superposition dimensions of a function may be estimated by 
the truncation and superposition dimensions of their spline approximations. To do 
this requires computationally efficient formulas for the truncation and superposition 
variances. The truncation variance may be written as 

uCl:d uCl-.d keK{C) 

= p2™ ^ \Z{k)\'R^{k), 



where Rd is the DWT of the kernel Rd defined as 



Rd{x,y) = Rd{xQy,0) 

uC.l:d uC.l:d 
d d 

By convention R{x, y) is defined as Rs{x, y). Note that the total variance of the 
spline approximation to / is given by taking d = s above: 



(4.4) a'{sf)=p'"^ E mfm^p'"- E i/wi' 

fce/c(c) fcGK(C) 



R{k) 

where p.2p has been used. 

For jj of the form Pjj and thus 7„ = /3'"'7„ the DWT R{k) may be written as 
an s-degree polynomial in with vanishing constant term: 

s 

R{x,y) = R{xey,0)=J2(^^' E fMxuQyu.O), 



J=l uCl:s 



i?(fc) = E/5'-''Q(^'j')' 
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where the Q{k,j) are the coefficients of this polynomial in and are given by 

uCll-.s 

The coefficients Q{k,j) may be obtained by evaluating R{k) at s different values 
of , and then performing polynomial interpolation, a relatively inexpensive pro- 
cedure requiring 0{s^) operations. These Q{k,j) may then be used to evaluate the 
superposition variance as follows: 

0<|M|<d 0<|M|<d keK{C) 

E 1^(^)1' E /^""'7^i4(fc) 

k£K{C) 0<|u|<d 
d 

keK{C) j=l 

Thus, both the truncation and superposition dimensions may be evaluated via the 
FWT algorithm using ©(s^mp™ log(p)) operations. 

5. Numerical Results 

5.1. Optimization of the Kernel Parameters. The 7^ are weights in the ANOVA 
decomposition of the kernel and a controls the rate of decay of Fourier coefficients 
of the kernel. Their optimal values for spline interpolation depend on the partic- 
ular function / to be interpolated. Assume that p — 2, and assume that we have 
a digital net with 27V points, {xq, . . . , X2Ar_i}, whose first N points themselves 
constitute a digital net, as do the second N points. We can construct the spline 
intcrpolant by the values of f{xn) for n — 0, . . . , N — 1, and then estimate the error 
of the spline by the cost function 

(5.1) E (/(Xn)-^/(x„))^ 

n=N+l 

The values of Sf{xn) iorn = N, . . . , 2N—1 can be evaluated by a fast discrete Walsh 
transform. There are s optimization parameters 7^ and the optimization process 
becomes slow when s increases. In order to reduce the number of optimization 
parameters, the values of jj are assumed to be of the following form in this section, 

7, = Pf 

where g is a new optimization parameter. Thus, for any s, there are only three 
optimization parameters: /3, a and q. Their optimal values in Sections 15.21 and 15.31 
were found by optimizing (|5.ip using the function fminsearch in MATLAB. 
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ak 


s 




C^trc 




^sup 












*»i7^ 1 m 




opllIlG 




10 


10 


10 


10 


3 


2 


1 


20 


20 


20 


20 


5 


2 




40 


40 


39 


40 


8 


2 




10 


10 


10 


10 


2 


2 


k 


20 


18 


19 


18 


2 


2 




40 


33 


34 


31 


2 


2 




10 


5 


5 


5 


2 


2 




20 


5 


5 


5 


2 


2 




40 


5 


5 


5 


2 


2 



Table 1. The effective dimensions of the multiplicative functions 



5.2. EfTective Dimension of Multiplicative Functions. Consider a class of 
test functions in [25] 



- Qfc 
ak 



k=l 

where Ofc are parameters. We consider three possible choices of Ofc: = 1, Ofc = fc 
and flfc = fc^ for fc = 1 . . . s. The effective dimension in both senses of these functions 
can be computed analytically, or estimated by the algorithm in 25J. We compare 
them with the effective dimensions of the interpolating spline as described in the 
previous section. Table [T] shows the results using a Sobol point set with to = 12 
and p = 2, so AT = 4096. 

Note that dgup cannot be calculated by the method in [25] . The spline method de- 
scribed in the previous section can estimate both dsup and dtrc well in all cases except 
for c?sup in the case ak — 1 and s — 20, 40. Table|2]shows cr^{f) by analytical calcula- 
tion and two approximations: CTqmcI/) = Ji J2n=o P{xn)~{^ J2n=o f{'^n)f and 
(y^iSf) by (14. 4p . Table 12] also shows the ratio cr"^ {S f) / aqy^jif) as an indicator of 
how well these two approximations agree. One can prove that o''^ {S f) / aQ^Q{f) < 
1. It is found that the values of cr^ [S f) / crQ-^df) are close to unity for most cases 
but are extremely small for = 1, s = 20,40. It was reported in that the 
errors of the numerical integration by the QMC methods were also large for these 
latter cases. 

5.3. Asian Option Pricing. The pricing of an Asian option is based on the arith- 
metic average of the stock prices in a particular period of time. The payoff of the 
call option at the end of period T is 



payoff — max j - ^ Sj — 



where Sj is the stock price at time tj = jT/s,j = 0, . . . , s, ts — T and K is the 
strike price. Based on the risk-neutral valuation principle, the price of the call 
option at i = should be 

i?(e-'^^payoff). 
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en 


"QMC W ) 


" \'~'J) 






10 


1.2265 


1.9990 


1.0434 


0.5220 


1 


20 


3.9573 


2.7 X 10^ 


0.2692 


9.9704 X 10-5 




40 


23.5745 


2.98 X 10^° 


0.2393 


8.0302 X 10-12 




10 


0.1992 


0.2025 


0.1960 


0.9679 


k 


20 


0.2154 


0.2340 


0.2073 


0.8859 




40 


0.2246 


0.3134 


0.2088 


0.6662 




10 


0.1038 


0.1039 


0.1037 


0.9981 


e 


20 


0.1039 


0.1040 


0.1038 


0.9981 




40 


0.1039 


0.1041 


0.1038 


0.9971 



Table 2. The variances of the multiphcative functions estimated 
by a QMC method and the variances of the sphncs. 





Wang 


& Fang 


Sphne 


s 


standard 


BB PCA 




8 


7 


5 2 


7 2 


16 


14 


7 2 


14 2 


32 


27 


7 2 


27 2 



Table 3. The estimated effective dimensions of the Asian option 
pricing problem. 



where E{-) is the expected value of different path movements of the stock price and 
r is the risk-free interest rate. The stock price movement is assumed to follow a 
geometric Brownian motion, 

where a is the volatility, and Zi, . . . , Zs are independent standard normal random 
variables. See [27] for the details. 

Table |3] shows the results of the estimated effective dimension in both senses for 
an Asian option pricing problem with the following parameters: N — 2^"^ , Sq — K = 
100,(7 = 0.2, r = 0.1, and T = 1 year using a Sobol point set. The results in [25] 
are also duplicated. The column "standard" is the estimated truncation dimension 
of the original pricing function by their method. The columns "BB" and "PCA" 
are the estimated truncation dimensions when the Brownian bridge and principle 
component analysis dimension reduction methods are applied, respectively. The dtrc 
of the spline are exactly the same as the results in [^S] . The dgup are also the same as 
the truncation dimension after applying PCA, which is the best dimension reduction 
method in [25j . Moreover, the superposition dimension dsup = 2 is consistent with 
the analytical results in [27]. The variances of the discounted payoff for the Asian 
option pricing problem computed by the sample variance and the variance of the 
spline are shown in Table [U which indicate that the accuracy of the spline for 
estimating effective dimension should be reasonably good. 
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s 


'^QMc(/) 




'^'(^/)Kmc(/) 


8 


70.3669 


69.4148 


0.9865 


16 


71.6402 


69.0217 


0.9634 


32 


72.6395 


67.8101 


0.9335 



Table 4. The variances of the Asian option price problem esti- 
mated by a QMC method and the variances of the splines. 



6. Conclusion and Remarks 

A fast discrete Walsh transform over digital nets is derived in this article. This 
transform can be applied to reduce the cost of calculating the discrete Walsh coef- 
ficients from 0{N'^) to 0(7Vlog A^). Because the kernel used is piecewise constant, 
the spline interpolant based on this kernel may not be particularly accurate. How- 
ever, the spline interpolant does facilitate the estimation of coarser quantities, such 
as the variances of the ANOVA effects and the effective dimension of the function. 
The numerical results for two different families of functions are compared with the 
analytical results and the results from the literature showing that the estimation 
of effective dimension via the spline is accurate and efficient. 

Finally, we would like add two remarks. The function values in our interpola- 
tion method were sampled over a digital net, which was originally constructed for 
numerical quadrature. Sparse grids [2] are another sampling scheme and approx- 
imation method that can be applied to both numerical quadrature and function 
interpolation. Both methods can be classified as a priori grid optimization [2]. 
However, the point sets are based on different criteria. Digital nets minimize the 
discrepancy of the point set [13], while the sparse grid is selected based on the 
importance of the components in a tensor product of hierarchical function spaces. 
Sparse grids use 0{h~^ ilogh^^Y^^) points where h is the mesh size in each di- 
mension. This number is substantially smaller than 0(h~''), the number of points 
needed for an ordinary grid, but the number of points for a sparse grid increases 
exponentially with dimension for a given mesh size. On the other hand there exist 
digital nets with A'^ = points for to = 0, 1, . . ., independent of the dimension, s. 
Error bounds have been developed for sparse grid methods. An error analysis of 
the spline algorithm for digital nets is the object of future work. 

The second remark is that the fast algorithms in this paper are due to the cor- 
respondence between the point set and the kernel. In [29j , a similar technique was 
applied to integration lattices and the fast Fourier transform (FFT) . For computer 
experiments, one typically has control over where to sample the underlying func- 
tion, but for some problems of approximating a function based on observational 
or field data, the locations of the points where the function is sampled can not be 
selected in advance. Such problems are suited to the Nonequidistant Fast Fourier 
Transform (NFFT) pioneered by Potts [T7]. The NFFT provides a fast and rela- 
tively accurate, but only approximate, discrete Fourier transform of the data. The 
FFT applied to data sampled on an integration lattice and the Fast Walsh Trans- 
form (FWT) introduced here applied to data sampled on digital nets both provide 
fast discrete transforms with accuracy only limited by machine precision. 
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